Estimation of Speech Features of Glottal Excitation by Nonlinear Prediction
Abstract
Analysis of speech signals can be performed with the aid of linear or nonlinear statistics using appropriate prediction algorithms. In this contribution, speech features are derived from the results of a nonlinear prediction based on Volterra series. Two kinds of features are investigated: the prediction gain obtained from nonlinear statistics, and individual coefficients of the nonlinear components. The features are estimated quasi-continuously, resulting in a feature signal. Additionally, to obtain features which are highly sensitive to segmentation shifting, an asymmetric window function is integrated into the prediction algorithm. The analyses of speech signals show that the estimated features correlate with the glottal pulses. Furthermore, the investigations show that using the first individual nonlinear coefficient as a feature is advantageous over using the prediction gain.
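The abstract gives no equations, so the following is only a minimal sketch of how the two feature types could be computed for one analysis frame. It assumes a second-order Volterra predictor fitted by weighted least squares. The function name `volterra_features`, the default model order, the use of a nonlinear-over-linear error ratio as "prediction gain", and the way the window enters as least-squares weights are all illustrative assumptions, not the authors' algorithm.

```python
import numpy as np

def volterra_features(frame, order=4, window=None):
    """Hypothetical sketch: fit linear and second-order Volterra predictors
    to one frame and return (nonlinear prediction gain in dB,
    first quadratic coefficient)."""
    x = np.asarray(frame, dtype=float)
    n = len(x)
    # Assumption: the window acts as per-sample weights on the squared error.
    weights = np.ones(n) if window is None else np.asarray(window, dtype=float)

    # Linear regressors: column k-1 holds x[t-k] for k = 1..order.
    lin = np.column_stack([x[order - k: n - k] for k in range(1, order + 1)])
    # Quadratic regressors: products x[t-i] * x[t-j] with i <= j.
    pairs = [(i, j) for i in range(order) for j in range(i, order)]
    quad = np.column_stack([lin[:, i] * lin[:, j] for i, j in pairs])

    target = x[order:]
    w = np.sqrt(weights[order:])

    def fit(regressors):
        # Weighted least squares via row scaling by sqrt(weights).
        coeffs, *_ = np.linalg.lstsq(regressors * w[:, None], target * w, rcond=None)
        residual = (target - regressors @ coeffs) * w
        return coeffs, float(np.sum(residual ** 2))

    _, err_lin = fit(lin)
    coeffs, err_nl = fit(np.hstack([lin, quad]))

    eps = 1e-12  # guards against division by zero for perfectly predictable frames
    gain_db = 10.0 * np.log10((err_lin + eps) / (err_nl + eps))
    # coeffs[order] is the weight of x[t-1]**2, the first quadratic term.
    return gain_db, float(coeffs[order])
```

Sliding this function over the signal in small steps would give a quasi-continuous feature signal. An asymmetric window could be passed through `window`, for example the purely hypothetical choice `np.exp(np.linspace(-3.0, 0.0, len(frame)))`, which emphasises the most recent samples of the frame.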
Similar resources
Using Text and Acoustic Features in Predicting Glottal Excitation Waveforms for Parametric Speech Synthesis with Recurrent Neural Networks
This work studies the use of deep learning methods to directly model glottal excitation waveforms from context dependent text features in a text-to-speech synthesis system. Glottal vocoding is integrated into a deep neural network-based text-to-speech framework where text and acoustic features can be flexibly used as both network inputs or outputs. Long short-term memory recurrent neural networ...
Combination of Linear Prediction and Phase Decomposition for Glottal Source Analysis on Voiced Speech
Some glottal analysis approaches based upon linear prediction or complex cepstrum approaches have been proved to be effective to estimate glottal source from real speech utterances. We propose a new approach employing both an all-pole odd-order linear prediction to provide a coarse estimation and phase decomposition based causality/anti-causality separation to generate further refinements. The o...
The GlottHMM Entry for Blizzard Challenge 2011: Utilizing Source Unit Selection in HMM-Based Speech Synthesis for Improved Excitation Generation
This paper describes the GlottHMM speech synthesis system for Blizzard Challenge 2011. GlottHMM is a hidden Markov model (HMM) based speech synthesis system that utilizes glottal inverse filtering for separating the vocal tract and the glottal source from speech signal and models both components individually. In this year’s entry, stabilized weighted linear prediction (SWLP) is used to yield mo...
Effect of Tongue Tip Trilling on the Glottal Excitation Source
Recent studies have indicated changes in the glottal excitation source characteristics apart from vocal tract resonances due to tongue tip trilling. In this paper we study the significance of changing vocal tract system and the associated glottal excitation source characteristics due to trilling, from perception point of view. These studies are made by generating speech signal by either retaini...
Epoch-based analysis of speech signals
Speech analysis is traditionally performed using short-time analysis to extract features in time and frequency domains. The window size for the analysis is fixed somewhat arbitrarily, mainly to account for the time varying vocal tract system during production. However, speech in its primary mode of excitation is produced due to impulse-like excitation in each glottal cycle. Anchoring the speech...